It is known that an ordinal is the order type of the lexicographic ordering of a regular language if
and only if it is less than $\omega^\omega$. We design a polynomial time algorithm that constructs, for each
well-ordered regular language L with respect to the lexicographic ordering, given by a deterministic
finite automaton, the Cantor Normal Form of its order type. It follows that there is a polynomial
time algorithm to decide whether two deterministic finite automata accepting well-ordered regular
languages accept isomorphic languages. We also give estimates on the size of the smallest automaton
representing an ordinal less than $\omega^\omega$, together with an algorithm that translates each such ordinal to
an automaton.

1 Introduction 

One of the basic decision problems in the theory of automata and languages is the equivalence or equality
problem that asks if two specifications define equal languages. In this paper we study the related "iso- 
morphism problem" of deciding whether the lexicographic orderings of the languages defined by two 
specifications are isomorphic, i.e., whether the two languages determine "isomorphic dictionaries". 

The study of lexicographic orderings of regular languages, or equivalently, lexicographic orderings 
of the leaves of regular trees goes back to [6]. Thomas [14] has shown without giving any complexity
bounds that it is decidable whether the lexicographic orderings of two regular languages (given by finite
automata or regular expressions) are isomorphic. In contrast, the results in [2] imply that there is an
exponential algorithm to decide whether the lexicographic orderings of two regular languages, given by
deterministic finite automata (DFA), are isomorphic. On the other hand, no such algorithm exists for
context-free languages, cf. lEl. In this paper, one of our aims is to show that there is a polynomial time
algorithm to decide for DFA accepting lexicographically well-ordered languages, whether they accept
isomorphic languages with respect to the lexicographic order.

The ordinals that arise as order types of lexicographic well-orderings of regular languages are exactly
the ordinals less than $\omega^\omega$, cf. ||3j[l3. The Cantor Normal Form (CNF) ifTTI of any such nonzero ordinal
takes the form $\omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k$, where $k$, $n_i$ and $m_i$ are integers such that
$k \ge 0$, $m_i \ge 1$ for $i = 0, \ldots, k$, and $n_0 > \cdots > n_k \ge 0$. We provide an algorithm that, given an
"ordinal automaton" representing a well-ordering, computes its CNF.

We also give estimates on the size of the smallest ordinal automaton representing an ordinal less than
$\omega^\omega$, together with an algorithm that translates such an ordinal to an automaton.

In the main part of the paper we will restrict ourselves to DFA over the binary alphabet {0, 1} ac- 
cepting a complete prefix language (complete prefix code). However, this restriction is only a technical 
convenience and is not essential for the results. 

2 Lexicographic orderings

Suppose that $\Sigma$ is an alphabet linearly ordered by the relation $<$. We define the lexicographic ordering
$<_{\mathrm{lex}}$ of the set $\Sigma^*$ by $u <_{\mathrm{lex}} v$ iff $u$ is either a proper prefix of $v$ or $u$ and $v$ are of the form
$u = xay$, $v = xbz$ with $a < b$ in $\Sigma$. When $L \subseteq \Sigma^*$, we obtain a (strict) linear ordering $(L, <_{\mathrm{lex}})$,
called the lexicographic ordering of L. It is known that if $\Sigma$ has two or more letters, then every countable
linear ordering is isomorphic to the linear ordering $(L, <_{\mathrm{lex}})$ of some language $L \subseteq \Sigma^*$, see e.g. [3].
Moreover, we may restrict ourselves to prefix languages, for if $L \subseteq \{a_1, \ldots, a_n\}^*$ where the alphabet is
ordered as indicated, then $(L, <_{\mathrm{lex}})$ is isomorphic to $(La_0, <_{\mathrm{lex}})$, where $a_0$ is a new letter which is
lexicographically less than any other letter. Further, we may restrict ourselves to the binary alphabet,
since each ordered alphabet of n letters can be encoded by words over {0, 1} of length $\lceil \log n \rceil$ in an
order preserving manner. Actually, it suffices to consider complete prefix languages $L \subseteq \{0,1\}^*$ having
the property that for any $u \in \{0,1\}^*$, $u0$ is in the set pre(L) of all prefixes of words in L iff $u1 \in$ pre(L).
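
As a small illustration of the definition of the lexicographic ordering, the following Python sketch compares two words over an ordered alphabet. The function name `lex_less` and the representation of words as strings whose letters are ordered by code point are assumptions made for the example only.

```python
def lex_less(u: str, v: str) -> bool:
    """Return True iff u <_lex v, following the two cases of the definition."""
    # Case 1: u is a proper prefix of v.
    if len(u) < len(v) and v.startswith(u):
        return True
    # Case 2: u = x a y and v = x b z with a < b at the first differing position.
    for a, b in zip(u, v):
        if a != b:
            return a < b
    # Here u == v, or v is a proper prefix of u.
    return False


# A finite sample of the complete prefix language 0*1, listed in lexicographic order.
sample = sorted(["1", "01", "001", "0001"])
```

For strings over `0` and `1`, Python's built-in string comparison agrees with `lex_less`, which is why `sorted` can be used directly on such samples.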

Suppose that $L \subseteq \{0,1\}^*$ is a complete prefix language. We define the complete binary tree $T_L$ to be
the tree whose vertices are the words in pre(L), such that each vertex $u \in$ pre(L) is either a leaf or has
two successors, the words $u0$ and $u1$. When L is the empty language, $T_L$ is the empty tree. Note that $T_L$
is an ordered tree, since the successors $u0$, $u1$ of a non-leaf vertex $u$ are ordered by $u0 <_{\mathrm{lex}} u1$. The linear
ordering $(L, <_{\mathrm{lex}})$ is just the ordering of the leaves of $T_L$. Note that each infinite branch of $T_L$ determines
an $\omega$-word over {0, 1}. Below we will make use of the following simple fact, see also [5].

Lemma 2.1 Suppose that $L \subseteq \{0,1\}^*$ and consider the tree $T_L$. Then $(L, <_{\mathrm{lex}})$ is a well-ordering iff the
$\omega$-word determined by each infinite branch of $T_L$ contains a finite number of occurrences of 0.

Call a linear ordering regular if it is isomorphic to the lexicographic ordering of a regular (complete 
prefix) language over some ordered alphabet, or equivalently, over the alphabet {0, 1}. A regular well- 
ordering is a regular linear ordering that is a well-ordering. 

Regarding linear orderings and ordinals, we will use standard terminology. Below we review some 
simple facts for linear orderings and ordinal arithmetic (restricted to ordinals less than $\omega^\omega$). For all
unexplained notions we refer to f IT]|. 

Suppose that $P = (P, <_P)$ and $Q = (Q, <_Q)$ are disjoint (strict) linear orderings. Then the ordered sum
$P + Q$ is the linear ordering $(P \cup Q, <)$, where the restriction of $<$ to P is the relation $<_P$ and similarly
for Q, and where $x < y$ holds for all $x \in P$ and $y \in Q$. It is known that if P and Q are well-ordered of order
type $\alpha$ and $\beta$, respectively, where $\alpha$ and $\beta$ are ordinals, then $P + Q$ is well-ordered of order type $\alpha + \beta$.
In addition to sum, we will make use of the product operation. Given P and Q as above, let us define
the following linear order $<$ of the set $P \times Q$: for all $(x,y), (x',y') \in P \times Q$, $(x,y) < (x',y')$ iff $y <_Q y'$,
or $y = y'$ and $x <_P x'$. When P, Q are well-ordered of order type $\alpha$, $\beta$, respectively, then $P \times Q$ is also
well-ordered of order type $\alpha \times \beta$.

As mentioned in the Introduction, it is known that a well-ordering is regular iff its order type is less
than the ordinal $\omega^\omega$. The Cantor Normal Form (CNF) ifTTI of each nonzero ordinal less than this bound
is of the form $\omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k$, where $k \ge 0$ and $n_i$ and $m_i$ are integers with
$n_0 > \cdots > n_k \ge 0$ and $m_i \ge 1$ for all $i = 0, \ldots, k$. The exponent $n_0$ is called the degree.

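In order to compute the CNF of the sum of two nonzero ordinals less than $\omega^\omega$, it is helpful to know
that $\omega^m + \omega^n = \omega^n$ whenever $m < n$. Thus, when

$$\alpha = \omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k \quad\text{and}\quad \beta = \omega^{n'_0} \times m'_0 + \cdots + \omega^{n'_\ell} \times m'_\ell,$$

then the CNF of $\alpha + \beta$ can be computed as follows. First, suppose that $n_0, \ldots, n_{i-1}$ are all greater than
$n'_0$ and $n_i \le n'_0$. If $n_i = n'_0$, then $\alpha + \beta$ is

$$\omega^{n_0} \times m_0 + \cdots + \omega^{n_{i-1}} \times m_{i-1} + \omega^{n_i} \times (m_i + m'_0) + \omega^{n'_1} \times m'_1 + \cdots + \omega^{n'_\ell} \times m'_\ell.$$

If $n_i < n'_0$, then $\alpha + \beta$ is

$$\omega^{n_0} \times m_0 + \cdots + \omega^{n_{i-1}} \times m_{i-1} + \omega^{n'_0} \times m'_0 + \cdots + \omega^{n'_\ell} \times m'_\ell.$$

Finally, suppose that $n_k > n'_0$. In that case $\alpha + \beta$ is

$$\omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k + \omega^{n'_0} \times m'_0 + \cdots + \omega^{n'_\ell} \times m'_\ell.$$

In order to compute the product $\alpha \times \beta$, it suffices to know that product distributes over sum on the left,
and if $\alpha$ is the ordinal given above, then $\alpha \times \omega = \omega^{n_0+1}$.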
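
The rule for adding two Cantor Normal Forms can be sketched in a few lines of Python. The representation is an assumption made for illustration: an ordinal below $\omega^\omega$ is a list of `(exponent, coefficient)` pairs with strictly decreasing exponents and positive coefficients, the empty list stands for 0, and the names `cnf_add` and `cnf_times_omega` are hypothetical.

```python
def cnf_add(alpha, beta):
    """CNF of alpha + beta for CNFs given as lists of (exponent, coefficient)."""
    if not beta:
        return list(alpha)
    n0, m0 = beta[0]                                  # leading term of beta
    # Terms of alpha with exponent above n0 survive unchanged.
    kept = [(n, m) for n, m in alpha if n > n0]
    # A term of alpha with exponent exactly n0 merges into beta's leading term;
    # all lower terms of alpha are absorbed, since omega^m + omega^n = omega^n for m < n.
    merged = sum(m for n, m in alpha if n == n0)
    return kept + [(n0, m0 + merged)] + list(beta[1:])


def cnf_times_omega(alpha):
    """CNF of alpha x omega for a nonzero alpha, namely omega^(n_0 + 1)."""
    return [(alpha[0][0] + 1, 1)]
```

For instance, with this encoding $(\omega^2 \times 2 + \omega) + \omega^2$ is computed by `cnf_add([(2, 2), (1, 1)], [(2, 1)])`.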

3 Ordinal automata 

We will be considering DFA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$, where Q is the finite set of states, {0, 1} is the
input alphabet, $\delta$ is a partial function $Q \times \{0,1\} \to Q$, the transition function, $q_0 \in Q$ is the initial state,
and $F \subseteq Q$ is the set of final states. As usual, we extend $\delta$ to a partial function $Q \times \{0,1\}^* \to Q$ and write
$qu$ for $\delta(q,u)$, for $q \in Q$ and $u \in \{0,1\}^*$.

The language $L(\mathcal{A})$ accepted by the DFA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ is the set $\{u \in \{0,1\}^* : q_0u \in F\}$.
As usual, we call an automaton $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ trim if each state $q \in Q$ is both accessible and
co-accessible, i.e., when there exist words $u, v \in \{0,1\}^*$ with $q_0u = q$ and $qv \in F$. It is well-known that
if $L(\mathcal{A})$ is nonempty, then $\mathcal{A}$ is equivalent to a trim automaton that can be easily constructed from $\mathcal{A}$
by removing all states that are not accessible or co-accessible. To avoid trivial situations, we will only
consider automata that accept a nonempty language, so that we may restrict ourselves to trim automata.

A trim automaton $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ accepts a prefix language iff neither $q0$ nor $q1$ is defined
when $q \in F$. Moreover, assuming that this holds, $\mathcal{A}$ accepts a complete prefix language iff for every
$q \in Q \setminus F$, both $q0$ and $q1$ are defined. We will call such trim automata complete prefix automata (CPA).
It is clear that for each trim automaton $\mathcal{A}$ accepting a prefix language one can construct a CPA
$\mathcal{A}' = (Q', \{0,1\}, \delta', q_0, F')$ with $Q' \subseteq Q$ such that $(L(\mathcal{A}), <_{\mathrm{lex}})$ is isomorphic to $(L(\mathcal{A}'), <_{\mathrm{lex}})$.
To this end, for each state $q \in Q$ we form the unique sequence of states $q = q_1, q_2, \ldots, q_k$ such that for
each $1 \le i < k$, $q_{i+1} = q_i0$ or $q_{i+1} = q_i1$, moreover, exactly one of $q_i0$ and $q_i1$ is defined, and finally either
$q_k \in F$ (in which case neither $q_k0$ nor $q_k1$ is defined), or both $q_k0$ and $q_k1$ are defined. If $q_k \in F$, then we
remove the transitions used to form this sequence and declare $q$ to be a final state. If $q_k \notin F$, then we
replace the transition originating in $q$ by the two transitions $\delta'(q,i) = \delta(q_k,i)$, $i = 0, 1$. Finally, we
remove states that are not accessible or co-accessible.
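
The conversion to a CPA described above can be sketched as follows. The encoding is an assumption for this example: `delta` maps each non-final state to a dictionary from letters (0 or 1) to states, `finals` is a set, and the input is assumed to be trim and to accept a prefix language. The name `to_cpa` is hypothetical.

```python
def to_cpa(delta, q0, finals):
    """Return (delta', q0, finals') of a CPA with an isomorphic lexicographic ordering."""

    def end_of_chain(q):
        # Follow q = q_1, q_2, ..., q_k while exactly one transition is defined.
        while q not in finals and len(delta[q]) == 1:
            (q,) = delta[q].values()
        return q

    new_delta, new_finals = {}, set()
    seen, stack = {q0}, [q0]
    while stack:                      # only states accessible from q0 are kept
        q = stack.pop()
        qk = end_of_chain(q)
        if qk in finals:
            new_finals.add(q)         # the chain ends in a final state: q becomes final
            new_delta[q] = {}
        else:
            new_delta[q] = dict(delta[qk])   # q inherits both transitions of q_k
        for r in new_delta[q].values():
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return new_delta, q0, new_finals


# Example: the prefix language {00, 1} is turned into the complete prefix language {0, 1}.
example = to_cpa({"p": {0: "a", 1: "f2"}, "a": {0: "f1"}}, "p", {"f1", "f2"})
```

Exploring from the initial state takes care of accessibility; co-accessibility is inherited from the trim input, since every kept transition leads to a state that was co-accessible before.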

Suppose that $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ is a DFA. By the size of $\mathcal{A}$ we will mean the number of states
in Q. The strongly connected components of $\mathcal{A}$ are defined as usual. We say that a strongly connected
component C is trivial if C consists of a single state $q$ and $q \notin \{q0, q1\}$. Otherwise C is called nontrivial.
We impose the usual partial order on strongly connected components by defining $C \le C'$ iff there exist
some $q \in C'$ and $u \in \{0,1\}^*$ with $qu \in C$. The height of a nontrivial strongly connected component C
is the length $k$ of the longest sequence $C_1, \ldots, C_k$ of nontrivial strongly connected components such that
$C_1 < \cdots < C_k$ and $C_k = C$. From Lemma 2.1 we immediately have:

Proposition 3.1 A CPA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ accepts a well-ordered language iff for each nontrivial
strongly connected component C and $q \in C$ it holds that $q0 \notin C$ (and of course $q1 \in C$).

We conclude that there is a simple algorithm to decide whether a CPA accepts a well-ordered language
which runs in polynomial time in the size of the automaton, see also HHH. It is trivial to extend
this result to automata over larger alphabets.
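
One way to realize the test of Proposition 3.1 is sketched below. It relies on the observation that $q0$ lies in the strongly connected component of $q$ exactly when $q$ is reachable from $q0$, so no explicit component computation is needed. The dictionary encoding of the transition function and the function names are assumptions for the example, and the input is assumed to be a CPA.

```python
def reachable_from(delta, start):
    """Set of states reachable from `start` by some (possibly empty) word."""
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for r in delta.get(q, {}).values():
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts_well_ordered(delta):
    """True iff no state can be reached again through its own 0-transition."""
    for q, out in delta.items():
        if 0 in out and q in reachable_from(delta, out[0]):
            return False
    return True
```

This simple version runs one graph search per state, which is quadratic in the size of the automaton in the worst case but still polynomial.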

Definition 3.2 An ordinal automaton (OA) is a CPA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ such that whenever $q$
belongs to a nontrivial strongly connected component C, $q0$ does not belong to C.

By the previous proposition, a CPA $\mathcal{A}$ is an OA iff it accepts a well-ordered (complete prefix)
language. For an OA $\mathcal{A}$, we call the order type of $(L(\mathcal{A}), <_{\mathrm{lex}})$ the ordinal represented by $\mathcal{A}$, denoted
$o(\mathcal{A})$.

Lemma 3.3 For each $n \ge 0$, there is an OA $\mathcal{A}_n$ of size $n + 1$ representing $\omega^n$.

Proof. Let $\mathcal{A}_n$ have states $s_0, \ldots, s_n$ with transitions $\delta(s_i, 1) = s_i$ and $\delta(s_i, 0) = s_{i-1}$ for all $1 \le i \le n$.
The initial state is $s_n$ and the only final state is $s_0$. □
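
The automaton of Lemma 3.3 is easy to write down explicitly. In the sketch below, states $s_0, \ldots, s_n$ are encoded as the integers 0 to n and the transition function as a dictionary; both choices, and the name `omega_power_automaton`, are assumptions made for illustration.

```python
def omega_power_automaton(n):
    """Return (delta, initial, finals) for the (n + 1)-state OA representing omega^n."""
    delta = {0: {}}                       # s_0 is final and has no transitions
    for i in range(1, n + 1):
        delta[i] = {1: i, 0: i - 1}       # s_i 1 = s_i and s_i 0 = s_{i-1}
    return delta, n, {0}
```

For n = 1 the accepted language is 1*0, whose lexicographic ordering 0 < 10 < 110 < ... has order type $\omega$.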

Example 3.4 Consider the ordinal $\alpha = \omega^3 \times 2 + \omega$. An ordinal automaton representing $\alpha$ has 6 states,
$q_0, q_1, s_0, s_1, s_2, s_3$, where $q_0$ is the initial state and $s_0$ is the only final state. The transitions are defined
by $q_00 = q_1$, $q_01 = s_1$, $q_10 = q_11 = s_3$, and $s_i1 = s_i$, $s_i0 = s_{i-1}$ for $i = 1, 2, 3$.

We end this section with a construction converting a nonzero ordinal $\alpha < \omega^\omega$ to an OA. First, for
each $n \ge 1$, we construct a CPA $\mathcal{B}_n$ having a single final state which accepts a language of n words. The
CPA $\mathcal{B}_1$ has a single state which is both initial and final, and no transitions. If n is even, say $n = 2k$,
consider $\mathcal{B}_k$ and add a new initial state $s_0$ together with transitions $s_00 = s_01 = s'_0$ to the old initial state
$s'_0$. The only final state is the final state of $\mathcal{B}_k$. If $n = 2k + 1$ for some $k \ge 1$, then consider $\mathcal{B}_k$ with
initial state $s'_0$ and final state $s_f$. We add two new states $s_0$ and $s_1$ and new transitions $s_00 = s_1$, $s_01 = s_f$,
$s_10 = s_11 = s'_0$.

Now let the CNF of $\alpha$ be $\omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k$. When $k = 0$ and $m_0 = 1$, then we may take the
OA $\mathcal{A}_{n_0}$ of Lemma 3.3, for which we have that $o(\mathcal{A}_{n_0}) = \alpha$. So suppose that $k > 0$ or $m_0 > 1$. For each
$0 \le i \le k$, consider the automaton $\mathcal{B}_{m_i}$ constructed above with initial state $q_i$ and final state $c_i$, say.
We may assume that the state sets of the $\mathcal{B}_{m_i}$ are pairwise disjoint. Then we form the "ordered sum" of
the $\mathcal{B}_{m_i}$, $i = 0, \ldots, k$, by adding $k$ new states $s_0, \ldots, s_{k-1}$ and transitions $s_01 = s_1, \ldots, s_{k-2}1 = s_{k-1}$,
$s_00 = q_0, \ldots, s_{k-2}0 = q_{k-2}$, $s_{k-1}0 = q_{k-1}$ and $s_{k-1}1 = q_k$. Finally, take the automaton $\mathcal{A}_{n_0}$ of Lemma 3.3
and identify its state $s_{n_i}$ with $c_i$ for all $i = 0, \ldots, k$. The resulting OA has $n_0 + g(m_0) + \cdots + g(m_k)$ states
and represents $\alpha$, where

$g(1) = 1$ and $g(2m) = 1 + g(m)$, $g(2m + 1) = 2 + g(m)$ for all $m \ge 1$.
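
The doubling construction of the counting automata and the function $g$ can be checked against each other with a short sketch. The integer state names, the dictionary encoding of transitions, and the names `build_counter` and `count_words` are assumptions for the example; the construction itself follows the even and odd cases described above, applied along the binary digits of n.

```python
def g(m):
    """Number of states produced by the doubling construction for m words."""
    if m == 1:
        return 1
    return g(m // 2) + (1 if m % 2 == 0 else 2)


def build_counter(n):
    """Return (delta, initial, final) of a CPA with one final state accepting n words."""
    delta = {0: {}}                  # the one-state automaton for n = 1
    initial = final = 0
    for bit in bin(n)[3:]:           # binary digits of n after the leading 1
        s0 = len(delta)
        if bit == "0":               # from k words to 2k words
            delta[s0] = {0: initial, 1: initial}
        else:                        # from k words to 2k + 1 words
            s1 = s0 + 1
            delta[s1] = {0: initial, 1: initial}
            delta[s0] = {0: s1, 1: final}
        initial = s0
    return delta, initial, final


def count_words(delta, q, final):
    """Number of words leading from q to the final state (plain recursion, small n only)."""
    if q == final:
        return 1
    return sum(count_words(delta, r, final) for r in delta[q].values())
```

The recursion in `count_words` visits one path per accepted word, so it is only meant for checking small cases.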

4 From ordinal automata to CNF 

For this section, fix an OA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$. For each $q \in Q$, let us denote by $\mathcal{A}_q$ the automaton
$(Q_q, \{0,1\}, \delta_q, q, F_q)$, where $Q_q = \{qu : u \in \{0,1\}^*\}$, $\delta_q$ is the restriction of $\delta$ to $Q_q \times \{0,1\}$, and
$F_q = Q_q \cap F$.

The following lemma is clear. 

Lemma 4.1 For each state $q$, $\mathcal{A}_q = (Q_q, \{0,1\}, \delta_q, q, F_q)$ is also an ordinal automaton.

For each $q \in Q$, we let $o(q)$ denote the order type of $(L_q, <_{\mathrm{lex}}) = (L(\mathcal{A}_q), <_{\mathrm{lex}})$. By the above lemma,
$o(q)$ is a (nonzero) ordinal for each $q \in Q$.

Lemma 4.2 For all $q \in Q$ and $u \in \{0,1\}^*$, $o(qu) \le o(q)$. Thus, if $q$ and $q'$ belong to the same strongly
connected component, then $o(q) = o(q')$.

Proof. The function $v \mapsto uv$, $v \in \{0,1\}^*$, defines an order embedding of the linear ordering $(L_{qu}, <_{\mathrm{lex}})$
into $(L_q, <_{\mathrm{lex}})$. □

Proposition 4.3 If C is a nontrivial strongly connected component, then there is an integer $n \ge 1$ such
that for all $q \in C$ it holds that $o(q) = \omega^n$. Moreover, for each $q \in C$ the degree of $o(q0)$ is at most $n - 1$,
and there is some state $q' \in C$ such that the degree of $o(q'0)$ is $n - 1$.

Proof. By Definition 3.2, we can arrange the states in C in a sequence $s_0, \ldots, s_{k-1}$ such that
$s_i1 = s_{i+1 \bmod k}$ for all $i$. We also know that $s_i0 \notin C$ for all $i$. Thus,

$$o(s_0) = (o(s_00) + \cdots + o(s_{k-1}0)) \times \omega = \alpha \times \omega. \qquad (1)$$

Since $0 < \alpha < \omega^\omega$, this is possible only if $o(s_0) = \omega^n$ for some $n \ge 1$. It follows now by Lemma 4.2
that $o(s_i) = \omega^n$ for all $i$. Using the formula (1), it follows that the degree of each $o(s_i0)$ is at most $n - 1$.
Moreover, there is at least one $i_0$ such that the degree of $o(s_{i_0}0)$ is exactly $n - 1$, since otherwise the
degree of $o(s_0)$ would be less than $n$. □

When C is a strongly connected component, trivial or not, we let $o(C)$ denote the ordinal $o(q)$ for
$q \in C$.

Suppose that the strongly connected component containing q is trivial. Below we will say that a word 
$u$ leads from $q$ to a strongly connected component C if $qu \in C$ but $qv$ does not belong to any nontrivial
strongly connected component whenever v is a proper prefix of u. When C is a trivial strongly connected 
component consisting of a single final state q', then we also say that u leads from q to the final state q'. 
The following fact is clear. 

Proposition 4.4 Suppose that the strongly connected component of the state $q$ is trivial. Then let
$u_1, \ldots, u_k$ denote in lexicographic order all the words leading from $q$ to a nontrivial strongly connected
component, or to a final state.¹ Then $o(q) = o(qu_1) + \cdots + o(qu_k)$. Thus, the degree of $o(q)$ is the
maximum degree of the ordinals $o(qu_i)$, $i = 1, \ldots, k$.

We now prove a stronger version of Proposition 4.3.

Proposition 4.5 If C is a nontrivial strongly connected component of height n, then $o(C) = \omega^n$.

Proof. Suppose that C is a nontrivial strongly connected component of height n. Clearly, $n \ge 1$. We
argue by induction on n to prove that $o(C) \ge \omega^n$. This is clear when $n = 1$, since by Proposition 4.3,
$o(C) = \omega^m$ for some $m > 0$. Suppose now that $n > 1$. Then let $C'$ be a nontrivial strongly connected
component of height $n - 1$ accessible from a state of C by some word. Then there exists a state $q \in C$
with $q0 \notin C$ such that $C'$ is accessible from $q0$ by some word. Since $o(q0) \ge o(C') \ge \omega^{n-1}$, the degree
of $o(q0)$ is at least $n - 1$. By (1), $o(C) \ge \omega^n$.

Next we show that for any nontrivial strongly connected component C of height n, $o(C) \le \omega^n$. This
is clear when $n = 0$. Supposing $n > 0$, by Proposition 4.4 and the induction hypothesis we know that
the degree of $o(s0)$ is at most $n - 1$ for each $s \in C$. Thus, by Proposition 4.3, $o(C) \le \omega^n$. □

Corollary 4.6 If the degree of $o(\mathcal{A})$ is n, then $\mathcal{A}$ has at least $n + 1$ states.

Proof. This is clear when $n = 0$. Suppose now that $n > 0$. Since the degree of $o(\mathcal{A})$ is n, $\mathcal{A}$ has at
least one nontrivial strongly connected component of height n, and thus at least one nontrivial strongly
connected component of height $i$ for every $1 \le i \le n$. Together with a final state, this gives at least $n + 1$
states. □



¹ The number of such words is clearly finite.

As a corollary of the above facts, there is an algorithm that computes the CNF of the ordinal $o(\mathcal{A})$
represented by the ordinal automaton $\mathcal{A}$. First, using some standard polynomial time algorithm, we
determine the set K of all nontrivial strongly connected components together with all trivial strongly
connected components consisting of a single final state. We also determine $o(C) = \omega^n$ for each nontrivial
strongly connected component $C \in K$ by computing the height n of C. We set $o(C) = 1$ for all strongly
connected components $C \in K$ consisting of a single final state. If the initial state belongs to some
$C \in K$, then $o(\mathcal{A}) = o(C)$. Otherwise let n denote the maximum of the heights of the nontrivial strongly
connected components, and let $n = 0$ if there is no nontrivial strongly connected component. Let $K_n$
denote the set of all nontrivial strongly connected components in K of maximum height n. Using the
algorithms specified in the Appendix as subroutines with suitable parameters, we determine for each
$C \in K_n$ the number $m_C$ of all words $u$ leading from the initial state $q_0$ to C, together with the
lexicographically greatest such word $u_C$. Then we define $x_n$ as the lexicographically greatest word among
the $u_C$ and $m_n = \sum_{C \in K_n} m_C$. By Proposition 4.5 and Proposition 4.4, $o(\mathcal{A}) = \omega^n \times m_n + \alpha_{n-1}$ for some
unknown ordinal $\alpha_{n-1}$ of degree at most $n - 1$.

In the next step, we consider the set $K_{n-1}$ of all strongly connected components C in K of height
$n - 1$, and for each $C \in K_{n-1}$ we compute the number $m_C$ of all those words leading from $q_0$ to C that
are lexicographically greater than $x_n$, together with the lexicographically greatest such word $u_C$, if any.
Then $\alpha_{n-1} = \omega^{n-1} \times m_{n-1} + \alpha_{n-2}$, where $m_{n-1}$ is the sum of the integers $m_C$, $C \in K_{n-1}$, and $\alpha_{n-2}$ is
some unknown ordinal of degree at most $n - 2$. We also determine the lexicographically greatest word in
the set consisting of $x_n$ and all words $u_C$, $C \in K_{n-1}$, such that $m_C > 0$, and we denote this word by $x_{n-1}$.

Repeating the procedure, before the last step we know that $o(\mathcal{A}) = \omega^n \times m_n + \cdots + \omega \times m_1 + \alpha_0$,
where $\alpha_0 = m_0$ is an unknown finite ordinal. Moreover, we have computed a word $x_1$. In the last step, we
consider the set $K_0$ of those connected components in K that consist of a single final state. We determine
for each $C \in K_0$ the number $m_C$ of all words leading from $q_0$ to C that are lexicographically greater than $x_1$.
Then $m_0 = \sum_{C \in K_0} m_C$.

We conclude that $o(\mathcal{A}) = \omega^n \times m_n + \cdots + \omega \times m_1 + m_0$. To get the CNF of $o(\mathcal{A})$, we remove all
summands $\omega^i \times m_i$ with $m_i = 0$.

The length of each word $u_C$ determined in the above algorithm is bounded by the size of $\mathcal{A}$ and can
be determined in polynomial time. Similarly, the length of the binary representation of each $m_C$ is at
most the size of $\mathcal{A}$, and each $m_C$ can be computed in polynomial time in the size of $\mathcal{A}$. Thus, the overall
algorithm runs in polynomial time. We have proved:

Theorem 4.7 There is a polynomial time algorithm that, given an ordinal automaton $\mathcal{A}$, computes the CNF
of the ordinal $o(\mathcal{A})$ represented by $\mathcal{A}$.

Corollary 4.8 There is a polynomial time algorithm to decide for ordinal automata $\mathcal{A}$ and $\mathcal{B}$ whether
$o(\mathcal{A}) = o(\mathcal{B})$, i.e., whether $(L(\mathcal{A}), <_{\mathrm{lex}})$ and $(L(\mathcal{B}), <_{\mathrm{lex}})$ are isomorphic.

Proof. We compute in polynomial time the CNFs of $o(\mathcal{A})$ and $o(\mathcal{B})$ and check whether they are
identical. □
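
For experimentation, the CNF of $o(\mathcal{A})$ can also be obtained by a direct recursion on states. Note that this is not the word-counting procedure described above but a simpler variant, sketched under the following assumptions: the input really is an ordinal automaton, `delta` maps each non-final state to a dictionary from the letters 0 and 1 to states, and a CNF is a list of `(exponent, coefficient)` pairs. The recursion uses $o(q) = 1$ for final states, $o(q) = \omega^n$ for a state in a nontrivial component of height n (Proposition 4.5), and $o(q) = o(q0) + o(q1)$ otherwise. The name `ordinal_of` is hypothetical.

```python
from functools import lru_cache


def ordinal_of(delta, q0, finals):
    """CNF of the ordinal represented by an ordinal automaton (sketch)."""

    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            q = stack.pop()
            for r in delta.get(q, {}).values():
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    states = reach(q0)
    star = {q: reach(q) for q in states}          # reflexive reachability sets
    # q lies in a nontrivial component iff q is reachable from one of its successors.
    nontrivial = {q for q in states
                  if any(q in star[r] for r in delta.get(q, {}).values())}

    @lru_cache(maxsize=None)
    def height(q):
        # Nontrivial states strictly below the component of q.
        below = [r for r in star[q] if r in nontrivial and q not in star[r]]
        return 1 + max((height(r) for r in below), default=0)

    def add(alpha, beta):                          # CNF of alpha + beta
        n0, m0 = beta[0]
        kept = [(n, m) for n, m in alpha if n > n0]
        merged = sum(m for n, m in alpha if n == n0)
        return kept + [(n0, m0 + merged)] + beta[1:]

    @lru_cache(maxsize=None)
    def o(q):
        if q in finals:
            return ((0, 1),)
        if q in nontrivial:
            return ((height(q), 1),)
        return tuple(add(list(o(delta[q][0])), list(o(delta[q][1]))))

    return list(o(q0))


# The automaton of Example 3.4, with transitions as reconstructed there.
example_delta = {
    "q0": {0: "q1", 1: "s1"}, "q1": {0: "s3", 1: "s3"},
    "s3": {1: "s3", 0: "s2"}, "s2": {1: "s2", 0: "s1"},
    "s1": {1: "s1", 0: "s0"}, "s0": {},
}
example_cnf = ordinal_of(example_delta, "q0", {"s0"})
```

Two ordinal automata then accept isomorphic languages exactly when the two returned lists are equal. Each state is evaluated once thanks to memoization, and the coefficients are Python integers of arbitrary size.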

5 Minimal ordinal automata 

For a nonzero ordinal $\alpha < \omega^\omega$, let $\#(\alpha)$ denote the minimum number m such that $\alpha = o(\mathcal{A})$ for some
m-state OA $\mathcal{A}$. In this section we reduce the determination of the function $\#(\alpha)$ to another problem on
automata and give some estimations on $\#(\alpha)$ in terms of the CNF of $\alpha$.

Definition 5.1 Let $m_0, \ldots, m_k$ be positive integers. Then we let $f(m_0, \ldots, m_k)$ denote the minimal number
of states of a CPA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ having no nontrivial strongly connected component with the
following properties:

1. $F = \{c_0, \ldots, c_k\}$, where the states $c_i$ are pairwise different.

2. For each $i$ with $0 \le i \le k$, the language $L_i$ has exactly $m_i$ words:

$$L_i = \{u \in \{0,1\}^* : q_0u = c_i \wedge \forall v, j\ ((u <_{\mathrm{lex}} v \wedge q_0v = c_j) \Rightarrow j \ge i)\}$$

Note that $f(m_0, \ldots, m_k) \ge k + 1$. Also, when $m \ge 1$, $f(m)$ is the minimum number of states of a CPA
accepting a language of m words. In particular, $f(1) = 1$.

Lemma 5.2 Suppose that $\alpha$ is a nonzero ordinal with CNF $\omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k$. Then there is an
OA of size $n_0 - k + f(m_0, \ldots, m_k)$ representing $\alpha$.

Proof. We take a CPA $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ having no nontrivial strongly connected component
as in Definition 5.1 having $f(m_0, \ldots, m_k)$ states, and the automaton $\mathcal{A}_{n_0}$ constructed in Lemma 3.3. Then
we identify $c_i$ with $s_{n_i}$ for all $i = 0, \ldots, k$. □

Theorem 5.3 Suppose that the Cantor normal form of a nonzero ordinal $\alpha < \omega^\omega$ is
$\omega^{n_0} \times m_0 + \cdots + \omega^{n_k} \times m_k$. Then $\#(\alpha) = n_0 - k + m$ where $m = f(m_0, \ldots, m_k)$.

Proof. We have already shown that $\#(\alpha) \le n_0 - k + m$. Thus, it remains to prove that
$\#(\alpha) \ge n_0 - k + m$.

Suppose that $\mathcal{A} = (Q, \{0,1\}, \delta, q_0, F)$ is an OA with $o(\mathcal{A}) = \alpha$ having a least number of states
among all such automata. By Corollary 4.6, $\mathcal{A}$ must have at least $n_0 + 1$ states. Thus, when $k = 0$ and
$m_0 = 1$, $\#(\alpha) \ge n_0 + 1 = n_0 - k + m$, since $f(1) = 1$. So from now on we assume that $k > 0$ or $m_0 > 1$.

By the proof of Corollary 4.6, $\mathcal{A}$ has at least one nontrivial strongly connected component of height
$i$ for each $i$ with $1 \le i \le n_0$, and of course at least one final state. It is not possible that a nontrivial
strongly connected component C of height n, say, contains two or more states, since otherwise we could
select a state $q$ of C such that at least one strongly connected component $C'$ of height $n - 1$ is accessible
from $q0$ by some word $u$ (i.e., $q0u \in C'$), and redirect any transition going to C to the selected state $q$.
After that, we could remove all states in $C \setminus \{q\}$; the resulting ordinal automaton would still represent
the same ordinal, by Proposition 4.4 and Proposition 4.5. Similarly, for each $1 \le i \le n_0$, there must
be a single nontrivial strongly connected component of height $i$. Indeed, if C and $C'$ were different
nontrivial strongly connected components of the same height $i$, then we could remove $C'$ and redirect
every transition originally going to some state in $C'$ to a state in C; the resulting smaller OA would
represent the same ordinal. Clearly, $\mathcal{A}$ has a single final state. Also, if a state $q$ forms a nontrivial
strongly connected component of height $i$, and $q'$ is either the state that forms the single nontrivial
strongly connected component of height $i - 1$ if $i > 1$ or $q'$ is the single final state if $i = 1$, and if $q0$
is not $q'$, then we can redirect this transition from $q$ under 0 to $q'$. The OA obtained after removing those
states that possibly become inaccessible from the initial state still represents $\alpha$.

In conclusion, we have that $\mathcal{A}$ contains a subautomaton consisting of states $s_{n_0}, \ldots, s_0$ such that
$s_0$ is the final state and for each $i \ge 1$, $s_i$ forms a nontrivial strongly connected component of height
$i$. Moreover, $s_i1 = s_i$ and $s_i0 = s_{i-1}$ for all $i \ge 1$. Let $S = \{s_0, \ldots, s_{n_0}\}$. None of the states in $Q \setminus S$
is contained in any nontrivial strongly connected component, and each state is accessible from $q_0$ by
some word. Moreover, from each state $q \in Q \setminus S$ there is at least one word leading to some connected
component $\{s_i\}$, trivial or not. We claim that if $q0 = s_i$ or $q1 = s_i$ for some $q \in Q \setminus S$ and $0 \le i \le n_0$, then
there exists some $0 \le j \le k$ with $i = n_j$, i.e., $\omega^i$ appears in the CNF of $\alpha$. Indeed, if $q0 = s_i$, say, but $i$
is not in the set $\{n_0, \ldots, n_k\}$, then we can remove state $q$ and redirect all transitions going to $q$ to $q1$; the
resulting smaller OA still represents $\alpha$, a contradiction.

Since $k > 0$ or $m_0 > 1$, the initial state $q_0$ is not in S (since otherwise $o(q_0) = o(\mathcal{A})$ would be a
power of $\omega$). Let us order the set U of all words leading from $q_0$ to a strongly connected component
$\{s_i\}$, $i = 0, \ldots, n_0$, lexicographically. We know that for each $u \in U$, $q_0u \in S' = \{s_{n_j} : 0 \le j \le k\}$. Then,
by Proposition 4.4, in order to have $o(q_0) = \alpha$, for each $j$ with $0 \le j \le k$ there must be exactly $m_j$
words $u \in U$ with $q_0u = s_{n_j}$ and such that there is no lexicographically greater word $v \in U$ with
$q_0v \in \{s_{n_0}, \ldots, s_{n_{j-1}}\}$. This means that by removing all states in $S \setminus S'$ and all transitions originating in
the states belonging to $S'$, the resulting automaton has at least $f(m_0, \ldots, m_k)$ states, and thus $\mathcal{A}$ has at
least $n_0 - k + m$ states. □

Corollary 5.4 For each $n \ge 0$, there is up to isomorphism a unique OA with $n + 1$ states representing
$\omega^n$, the automaton constructed in Lemma 3.3.

5.1 The function f

In this section, we give some estimations on the function $f$ introduced above.

Proposition 5.5 For all positive integers $m_0, \ldots, m_k$,

$$f(m_0 + \cdots + m_k) \le f(m_0, \ldots, m_k) \le f(m_0) + \cdots + f(m_k) + k.$$

Proof. This is clear when $k = 0$, so assume that $k > 0$. To prove the upper bound, for each $m_i$
consider a CPA $\mathcal{A}_i$ of size $f(m_i)$ without nontrivial strongly connected components and having a single
final state $c_i$ which accepts a language of $m_i$ words. Without loss of generality, we may assume that
the state sets $Q_i$ are pairwise disjoint. Let $\mathcal{A}$ be the ordered sum of the $\mathcal{A}_i$ constructed as above. Then for
each $i$, there are exactly $m_i$ words taking the initial state $s_0$ to $c_i$, and whenever $s_0u = c_i$ and $s_0v = c_j$ with
$u <_{\mathrm{lex}} v$, it holds that $i \le j$. Since $\mathcal{A}$ has $f(m_0) + \cdots + f(m_k) + k$ states, we conclude that
$f(m_0, \ldots, m_k) \le f(m_0) + \cdots + f(m_k) + k$.

To prove the lower bound, consider the automaton $\mathcal{B}'$ obtained from a CPA $\mathcal{B}$ with $f(m_0, \ldots, m_k)$ states
as in Definition 5.1 by collapsing the final states $c_0, \ldots, c_k$ into a single final state. Then $\mathcal{B}'$ accepts a
language of $m_0 + \cdots + m_k$ words and has at most $f(m_0, \ldots, m_k)$ states. Thus,
$f(m_0 + \cdots + m_k) \le f(m_0, \ldots, m_k)$. □

In the rest of this section, we consider the case when $k = 0$. In this case, $f$ is a function on the
positive integers. It is not difficult to see that for each $n > 0$, $f(n)$ is the length of the shortest addition
chain lOl] representing n, i.e., $f(n)$ is the least integer $k$ for which there exist different integers
$1 = a_1 < \cdots < a_k = n$ such that for each $i > 1$ there exist $j_1, j_2$ with $a_i = a_{j_1} + a_{j_2}$. Addition chains have
a vast literature [13]. It is not difficult to show that $f(n)$ is at most the sum of $\log n$ and the number m
of occurrences of the digit 1 in the binary representation of n. If n is a power of 2, then $f(n) = \log n$. In
the first paper lfT2l published in the journal TCS, it was shown that $f(n)$ is at least $\log n + \log m - 2.13$,
where m is defined as above. By Q, it is an NP-complete problem to decide for integers $n, k \ge 1$ whether
$f(n) \le k$ holds.

6 Conclusion and open problems 

We have shown that there is a polynomial time algorithm to decide if two ordinal automata represent the 
same ordinal. Since it is decidable in polynomial time whether the lexicographic ordering of the language 
accepted by a DFA is well-ordered, and since every DFA accepting a well-ordered regular language can
be transformed in polynomial time to an ordinal automaton, the restriction to ordinal automata was 
inessential. 

A linear ordering is called scattered if it does not have a subordering isomorphic to the dense ordering 
of the rationals. By Hausdorff's theorem ifTTI . every linear ordering is a dense sum of scattered linear 
orderings. Call a language scattered if its lexicographic ordering has this property. 

Hausdorff classified countable linear orderings according to their rank. It follows from results proved 
in ifTOl that the rank of the lexicographic ordering of a scattered regular language is always finite. It 
is known (cf. |[T1) that a CPA $\mathcal{A}$ accepts a scattered regular language iff for each nontrivial strongly
connected component C and $q \in C$, either $q0 \notin C$ or $q1 \notin C$. It would be interesting to know whether
there is a polynomial time algorithm to decide whether two DFA accepting scattered languages accept 
isomorphic languages. 

Acknowledgement The author would like to thank all three referees for suggesting improvements and 
Szabolcs Ivan for the references on addition chains. 

Appendix

Suppose that $\mathcal{A} = (Q, \{0,1\}, \delta, q_1, F)$ with $Q = \{q_1, \ldots, q_n\}$ is a DFA over the binary alphabet {0, 1}
having no nontrivial strongly connected component such that all final states are sinks, i.e., whenever $q$
is a final state, neither $q0$ nor $q1$ is defined.

Algorithm 1 Input: A word $u$ (of length less than n) such that neither $q_1u0$ nor $q_1u1$ is defined.

Output: The number of different words $v$ accepted by $\mathcal{A}$ such that $u <_{\mathrm{lex}} v$.

Method: Let $M_0$ and $M_1$ denote the $Q \times Q$ transition matrices of $\mathcal{A}$ with respect to the letters 0 and
1, respectively. Let $u = u_1 \cdots u_k$, where each $u_i$ is either 0 or 1. For each $\ell$ with $1 \le \ell \le k$ and $u_\ell = 0$,
consider the sum $N_\ell = M_{u_1} \cdots M_{u_{\ell-1}} M_1 \sum_{j=0}^{n-1} (M_0 + M_1)^j$, where matrix sum and product are computed
in the semiring of natural numbers. Since each word of length n or longer induces the empty partial
function on the set of states, it is clear that for each $1 \le i, j \le n$, $(N_\ell)_{ij}$ is the number of words accepted by
$\mathcal{A}$ with initial state $q_i$ and final state $q_j$ of the form $u_1 \cdots u_{\ell-1}1x$.

Let $e$ denote the Q-dimensional row vector whose first entry is 1 and whose other entries are 0, and let
$f$ denote the Q-dimensional 0-1 column vector whose $i$th component is 1 iff $q_i \in F$, for $1 \le i \le n$. Then
$\sum_{u_\ell = 0} e N_\ell f$ is the number of all words accepted by $\mathcal{A}$ that are lexicographically greater than $u$. By the
above consideration, and since each number occurring in the computation is at most $2^n$, which can be
represented by $n + 1$ bits, this number can be computed in polynomial time in the number n of states.
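
The quantity computed by Algorithm 1 can be illustrated with a short sketch. It replaces the matrix products by memoized path counting on the acyclic transition graph, which yields the same number; this substitution, the dictionary encoding of transitions, the representation of $u$ as a list of integers 0 and 1, and the name `count_greater` are all assumptions made for the example.

```python
from functools import lru_cache


def count_greater(delta, q1, finals, u):
    """Number of accepted words v with u <_lex v, for an acyclic DFA whose finals are sinks."""

    @lru_cache(maxsize=None)
    def accepted_from(q):
        # Number of words leading from q to some final state.
        if q in finals:
            return 1
        return sum(accepted_from(r) for r in delta.get(q, {}).values())

    total, q = 0, q1
    for letter in u:
        # Wherever u reads 0, every accepted word that branches off with 1 is greater than u.
        if letter == 0 and 1 in delta.get(q, {}):
            total += accepted_from(delta[q][1])
        q = delta[q][letter]          # assumes q1 u is defined
    return total
```

Because $q_1u$ has no outgoing transitions, no accepted word properly extends $u$, so the words counted at the branching positions are exactly the accepted words greater than $u$.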

Algorithm 2 Input: A state $q \in F$.

Output: The lexicographically greatest word $u$ with $q_1u = q$.

Method: First, in polynomial time, compute the $Q \times Q$ binary reachability matrix M such that
$M_{q_i, q_j} = 1$ iff there is a word $u$ with $q_iu = q_j$. Then form a finite sequence of states $s_1, \ldots, s_k$ together
with letters $u_1, \ldots, u_{k-1}$ such that $s_1 = q_1$, and if $s_i \ne q$ then $s_{i+1} = s_i1$ and $u_i = 1$ if $M_{s_i1, q} = 1$; and
$s_{i+1} = s_i0$ and $u_i = 0$ otherwise. The length $k$ of this sequence is at most n and $u_1 \cdots u_{k-1}$ is the
lexicographically greatest word $u$ with $q_1u = q$.
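
The greedy walk of Algorithm 2 can be sketched as follows. Instead of a full reachability matrix, the sketch computes by a backward search only the set of states from which the target is reachable, which is all the walk needs; this simplification, the dictionary encoding, and the name `greatest_word_to` are assumptions for the example. The automaton is assumed to be acyclic and the target state a sink.

```python
def greatest_word_to(delta, q1, target):
    """Lexicographically greatest word leading from q1 to `target`, or None if there is none."""
    # Backward search: collect all states from which `target` is reachable.
    preds = {}
    for q, out in delta.items():
        for r in out.values():
            preds.setdefault(r, set()).add(q)
    good, stack = {target}, [target]
    while stack:
        r = stack.pop()
        for q in preds.get(r, ()):
            if q not in good:
                good.add(q)
                stack.append(q)
    if q1 not in good:
        return None

    # Greedy walk: take the letter 1 whenever the target stays reachable, else 0.
    word, s = [], q1
    while s != target:
        letter = 1 if delta[s].get(1) in good else 0
        word.append(letter)
        s = delta[s][letter]
    return word
```

Since the target is a sink, no word leading to it is a proper prefix of another such word, so preferring the letter 1 at each step does produce the greatest one.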



